Abstract

The detection and estimation of signals in noisy, limited data is a problem of interest to many scientific and
engineering communities. We present a mathematically justifiable, computationally simple, sample eigenvalue based
procedure for estimating the number of high-dimensional signals in white noise using relatively few samples. The
main motivation for considering a sample eigenvalue based scheme is the computational simplicity and the robustness
to eigenvector modelling errors which can adversely impact the performance of estimators that exploit information
in the sample eigenvectors.

There is, however, a price we pay by discarding the information in the sample eigenvectors; we highlight a
fundamental asymptotic limit of sample eigenvalue based detection of weak/closely spaced high-dimensional signals
from a limited sample size. This motivates our heuristic definition of the effective number of identifiable signals,
which is equal to the number of "signal" eigenvalues of the population covariance matrix which exceed the noise
variance by a factor strictly greater than $1 + \sqrt{\frac{\text{Dimensionality of the system}}{\text{Sample size}}}$.
The fundamental asymptotic limit brings into sharp focus why, when there are too few samples available so
that the effective number of signals is less than the actual number of signals, underestimation of the model order
is unavoidable (in an asymptotic sense) when using any sample eigenvalue based detection scheme, including the
one proposed herein. The analysis reveals why adding more sensors can only exacerbate the situation. Numerical
simulations are used to demonstrate that the proposed estimator, like Wax and Kailath's MDL based estimator,
consistently estimates the true number of signals in the dimension fixed, large sample size limit and, unlike
Wax and Kailath's MDL based estimator, consistently estimates the effective number of identifiable signals in the
large dimension, (relatively) large sample size limit.

EDICS Category: SSP-DETC Detection; SAM-SDET Source detection

I. Introduction

The observation vector, in many signal processing applications, can be modelled as a superposition of a finite
number of signals embedded in additive noise. Detecting the number of signals present becomes a key issue and is
often the starting point for the signal parameter estimation problem. When the signals and the noise are assumed,
as we do in this paper, to be samples of a stationary, ergodic Gaussian vector process, the sample covariance matrix 
formed from m observations has the Wishart distribution [1]. This paper uses an information theoretic approach, 
inspired by the seminal work of Wax and Kailath [2], for determining the number of signals in white noise from 
the eigenvalues of the Wishart distributed empirical covariance matrix formed from relatively few samples. 

The reliance of the Wax and Kailath estimator and its successors [3]-[7], to list a few, on the distributional
properties of the eigenvalues of non-singular Wishart matrices renders them inapplicable in high-dimensional, sample
starved settings where the empirical covariance matrix is singular. Ad-hoc modifications to such estimators are often 
not mathematically justified and it is seldom clear, even using simulations as in [5], whether a fundamental limit 
of detection is being encountered vis a vis the chronically reported symptom of underestimating the number of 
signals. 

This paper addresses both of these issues using relevant results [8]-[11] from large random matrix theory. The
main contributions of this paper are 1) the development of a mathematically justified, computationally simple, 
sample eigenvalue based signal detection algorithm that operates effectively in sample starved settings and, 2) the 
introduction of the concept of effective number of (identifiable) signals which brings into sharp focus a fundamental
limit in the identifiability, under sample size constraints, of closely spaced/low level signals using sample eigenvalue 
based detection techniques of the sort developed in this paper. 

The proposed estimator exploits the distributional properties of the trace of powers, i.e., the moments of the 
eigenvalues, of (singular and non-singular) Wishart distributed large dimensional sample covariance matrices. The 
definition of effective number of identifiable signals is based on the mathematically rigorous results of Baik- 
Silverstein [12], Paul [13] and Baik et al [14] and the heuristic derivation of the first author [15]. This concept 
captures the fundamental limit of sample eigenvalue based detection by explaining why, in the large system relatively 
large sample size limit, if the signal level is below a threshold that depends on the noise variance, sample size 
and the dimensionality of the system, then reliable sample eigenvalue based detection is not possible. This brings 
into sharp focus the fundamental undetectability of weak/closely spaced signals using sample eigenvalue based 
schemes when too few samples are available. Adding more sensors will only exacerbate the problem by raising the 
detectability threshold. 

Conversely, if the signal level is above this threshold, and the dimensionality of the system is large enough, 
then reliable detection using the proposed estimator is possible. We demonstrate this via numerical simulations 
that illustrate the superiority of the proposed estimator with respect to the Wax-Kailath MDL based estimator.
Specifically, simulations reveal that while both the new estimator and the Wax-Kailath MDL estimator are consistent 
estimators of the number of signals $k$ in the dimensionality $n$ fixed, sample size $m \to \infty$ sense, the MDL estimator
is an inconsistent estimator of the effective number of signals in the large system, large sample size limit, i.e., in
the $n, m(n) \to \infty$ limit where the ratio $n/m(n) \to c \in (0, \infty)$. Simulations suggest that the new estimator is a
consistent estimator of the effective number of signals in the $n, m(n) \to \infty$ with $n/m(n) \to c \in (0, \infty)$ sense. We
note that simulations will demonstrate the applicability of the proposed estimator in moderate dimensional settings 
as well. 

The paper is organized as follows. The problem formulation in Section II is followed by a summary in Section III
of the relevant properties of the eigenvalues of large dimensional Wishart distributed sample covariance matrices.
An estimator for the number of signals present that exploits these results is derived in Section IV. An extension
of these results to the frequency domain is discussed in Section V. Consistency of the proposed estimator and the
concept of effective number of signals are discussed in Section VI. Simulation results that illustrate the superior
performance of the new method in high dimensional, sample starved settings are presented in Section VII; some
concluding remarks are presented in Section VIII.




II. Problem formulation 

We observe $m$ samples ("snapshots") of possibly signal bearing $n$-dimensional snapshot vectors $x_1, \ldots, x_m$
where for each $i$, $x_i \sim \mathcal{N}_n(0, R)$ and the $x_i$ are mutually independent. The snapshot vectors are modelled as

$$x_i = \begin{cases} z_i & \text{No Signal} \\ A s_i + z_i & \text{Signal Present} \end{cases} \qquad \text{for } i = 1, \ldots, m, \tag{1}$$

where $z_i \sim \mathcal{N}_n(0, \sigma^2 I)$ denotes an $n$-dimensional (real or circularly symmetric complex) Gaussian noise vector
where $\sigma^2$ is assumed to be unknown, $s_i \sim \mathcal{N}_k(0, R_s)$ denotes a $k$-dimensional (real or circularly symmetric
complex) Gaussian signal vector with covariance $R_s$, and $A$ is an $n \times k$ unknown non-random matrix. In array
processing applications, the $j$-th column of the matrix $A$ encodes the parameter vector associated with the $j$-th
signal whose magnitude is described by the $j$-th element of $s_i$.

Since the signal and noise vectors are independent of each other, the covariance matrix of $x_i$ can be decomposed
as

$$R = \Psi + \sigma^2 I \tag{2}$$

where

$$\Psi = A R_s A', \tag{3}$$

with $'$ denoting the conjugate transpose. Assuming that the matrix $A$ is of full column rank, i.e., the columns of $A$
are linearly independent, and that the covariance matrix of the signals $R_s$ is nonsingular, it follows that the rank
of $\Psi$ is $k$. Equivalently, the $n - k$ smallest eigenvalues of $\Psi$ are equal to zero.

If we denote the eigenvalues of $R$ by $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n$ then it follows that the smallest $n - k$ eigenvalues
of $R$ are all equal to $\sigma^2$ so that

$$\lambda_{k+1} = \lambda_{k+2} = \ldots = \lambda_n = \lambda = \sigma^2. \tag{4}$$

Thus, if the true covariance matrix R were known a priori, the dimension of the signal vector k could be determined
from the multiplicity of the smallest eigenvalue of R. When there is no signal present, all the eigenvalues of R will
be identical. The problem in practice is that the covariance matrix R is unknown so that such a straightforward
algorithm cannot be used. The signal detection and estimation problem is hence posed in terms of an inference 
problem on m samples of n-dimensional multivariate real or complex Gaussian snapshot vectors. 

Inferring the number of signals from these m samples reduces the signal detection problem to a model selection 
problem for which there are many approaches. A classical approach to this problem, developed by Bartlett [16] 
and Lawley [17], uses a sequence of hypothesis tests. Though this approach is sophisticated, the main problem is 
the subjective judgement needed by the practitioner in selecting the threshold levels for the different tests. 

Information theoretic criteria for model selection such as those developed by Akaike [18], [19], Schwarz [20] and
Rissanen [21] address this problem by proposing the selection of the model which gives the minimum information
criterion. The criterion for the various approaches is generically a function of the log-likelihood of the maximum
likelihood estimator of the parameters of the model and a term which depends on the number of parameters of the 
model that penalizes overfitting of the model order. 

For the problem formulated above, Kailath and Wax [2] propose an estimator for the number of signals (assuming
$m > n$ and $x_i \in \mathbb{C}^n$) based on the eigenvalues $l_1 \ge l_2 \ge \ldots \ge l_n$ of the sample covariance matrix (SCM) defined
by

$$\widehat{R} = \frac{1}{m} \sum_{i=1}^{m} x_i x_i' = \frac{1}{m} X X' \tag{5}$$

where $X = [x_1 | \ldots | x_m]$ is the matrix of observations (samples). The Akaike Information Criterion (AIC) form of
the estimator is given by

$$\hat{k}_{\mathrm{AIC}} = \arg\min_{k} \; -2(n-k)\, m \log \frac{g(k)}{a(k)} + 2k(2n-k) \quad \text{for } k \in \mathbb{N}: 0 \le k < n \tag{6}$$

while the Minimum Description Length (MDL) criterion is given by

$$\hat{k}_{\mathrm{MDL}} = \arg\min_{k} \; -(n-k)\, m \log \frac{g(k)}{a(k)} + \frac{1}{2} k(2n-k) \log m \quad \text{for } k \in \mathbb{N}: 0 \le k < n \tag{7}$$

where $g(k) = \prod_{j=k+1}^{n} l_j^{1/(n-k)}$ is the geometric mean of the $n - k$ smallest sample eigenvalues and
$a(k) = \frac{1}{n-k} \sum_{j=k+1}^{n} l_j$ is their arithmetic mean.
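
As a minimal sketch (not the implementation used in [2]), the criteria in (6) and (7) can be evaluated directly from a list of sample eigenvalues. The function names `wk_scores` and `argmin` are hypothetical helpers introduced here for illustration; the sketch assumes $m > n$ so that all sample eigenvalues are strictly positive and the logarithms are defined.

```python
import math

def wk_scores(eigs, m):
    """Return (aic, mdl) score lists for k = 0, ..., n-1, following (6) and (7).

    eigs: sample eigenvalues of the SCM (any order, all > 0); m: sample size.
    """
    l = sorted(eigs, reverse=True)
    n = len(l)
    aic, mdl = [], []
    for k in range(n):
        tail = l[k:]                                   # the n - k smallest eigenvalues
        a = sum(tail) / (n - k)                        # arithmetic mean a(k)
        log_g = sum(math.log(x) for x in tail) / (n - k)  # log of geometric mean g(k)
        fit = -(n - k) * m * (log_g - math.log(a))     # -(n-k) m log(g(k)/a(k))
        aic.append(2.0 * fit + 2.0 * k * (2 * n - k))
        mdl.append(fit + 0.5 * k * (2 * n - k) * math.log(m))
    return aic, mdl

def argmin(scores):
    """Index of the smallest score, i.e., the estimated number of signals."""
    return min(range(len(scores)), key=scores.__getitem__)
```

For an idealized spectrum such as `[10, 3, 1, 1, 1, 1]` with `m = 100`, the log-ratio term vanishes at `k = 2` because the remaining eigenvalues are equal, so only the penalty term contributes there.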

It is known [2] that the AIC form inconsistently estimates the number of signals, while the MDL form estimates 
the number of signals consistently in the classical $n$ fixed, $m \to \infty$ sense. The simplicity of the estimator, and
the large sample consistency are among the primary reasons why the Kailath-Wax MDL estimator continues to 
be employed in practice [22]. In the two decades since the publication of the WK paper, researchers have come 
up with many innovative solutions for making the estimators more "robust" in the sense that estimators continue 
to work well in settings where the underlying assumptions of snapshot and noise Gaussianity and inter-snapshot 
independence (e.g. in the presence of multipath) can be relaxed as in the work of Zhao et al [23], [24], Xu et al 
[3], and Stoica-Cedervall [25] among others [26].

Despite its obvious practical importance, the robustness of the algorithm to model mismatch is an issue that we 
shall not address in this paper. Instead we aim to revisit the original problem considered by Wax and Kailath with 
the objective of designing an estimator that is robust to high-dimensionality and sample size constraints. We are 
motivated by the observation that the most important deficiency of the Wax-Kailath estimator and its successors 
that is yet to be satisfactorily resolved occurs in the setting where the sample size is smaller than the number of 
sensors, i.e., when $m < n$, which is increasingly the case in many state-of-the-art radar and sonar systems where
the number of sensors exceeds the sample size by a factor of 10-100 [27]. In this situation, the SCM is singular
and the estimators become degenerate, as seen in (6) and (7). Practitioners often overcome this in an ad-hoc fashion
by, for example, restricting $k$ in (7) to integer values in the range $0 \le k < \min(n, m)$ so that

$$\hat{k}_{\mathrm{MDL}} = \arg\min_{k} \; -(n-k)\, m \log \frac{g(k)}{a(k)} + \frac{1}{2} k(2n-k) \log m \quad \text{for } k \in \mathbb{N}: 0 \le k < \min(n, m). \tag{8}$$

Since large sample, i.e., $m \gg n$, asymptotics [28] were used to derive the estimators in [2], there is no rigorous
theoretical justification for such a reformulation even if the simulation results suggest that the WK estimators are 
working "well enough." 

This is true for other sample eigenvalue based solutions found in the literature that exploit the sample eigenvalue 
order statistics [6], [7], employ a Bayesian framework by imposing priors on the number of signals [29], involve 
solving a set of possibly high-dimensional non-linear equations [3], or propose sequential hypothesis testing 
procedures [4]. The fact that these solutions are computationally more intensive or require the practitioner to 
set subjective threshold levels makes them less attractive than the WK MDL solution; more importantly, they do 
not address the sample starved setting in their analysis or their simulations either. 

For example, in [7], Fishler et al use simulations to illustrate the performance of their algorithm with $n = 7$
sensors and a sample size $m > 500$ whereas in a recent paper [26], Fishler and Poor illustrate their performance
with $n = 10$ sensors and $m > 50$ samples. Van Trees discusses the various techniques for estimating the number
of signals in Section 7.8 of [22]; the sample starved setting where $m = O(n)$ or $m < n$ is not treated in the
simulations either. 

There is, however, some notable work on detecting the number of signals using short data records. Particle filter
based techniques [30] have proven to be particularly useful in such short data record settings. Their disadvantage,
from our perspective, is that they require the practitioner to model the eigenvectors of the underlying population
covariance matrix as well; this makes them especially sensitive to model mismatch errors that are endemic to 
high-dimensional settings. 

This motivates our development of a mathematically justifiable, sample eigenvalue based estimator with a 
computational complexity comparable to that of the modified WK estimator in (8) that remains robust to
high-dimensionality and sample size constraints. The proposed new estimator is given by

$$t_k = \left[ (n-k)\, \frac{\sum_{i=k+1}^{n} l_i^2}{\left( \sum_{i=k+1}^{n} l_i \right)^2} - \left( 1 + \frac{n}{m} \right) \right] n - \left( \frac{2}{\beta} - 1 \right) \frac{n}{m}, \tag{9a}$$

$$\hat{k}_{\mathrm{NEW}} = \arg\min_{k} \left\{ \frac{\beta}{4} \left[ \frac{m}{n} \right]^2 t_k^2 \right\} + 2(k+1) \quad \text{for } k \in \mathbb{N}: 0 \le k < \min(n, m). \tag{9b}$$

Here $\beta = 1$ if $x_i \in \mathbb{R}^n$, and $\beta = 2$ if $x_i \in \mathbb{C}^n$.

In (9), the $l_i$'s are the eigenvalues of the sample covariance matrix $\widehat{R}$. An implicit assumption in the derivation of
(9) is that the number of signals is much smaller than the system size, i.e., $k \ll n$. A rather important consequence
of our sample eigenvalue based detection scheme, as we shall elaborate in Section VI, is that it just might not be
possible to detect low level/closely spaced signals when there are too few samples available.
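
The proposed criterion is simple enough to evaluate in a few lines. The following is a minimal sketch, assuming the statistic takes the form $t_k = \left[(n-k)\sum_{i>k} l_i^2 / (\sum_{i>k} l_i)^2 - (1 + n/m)\right] n - (2/\beta - 1)\, n/m$ and the score is $(\beta/4)(m/n)^2 t_k^2 + 2(k+1)$; the function names are hypothetical and the eigenvalues are assumed to be supplied by the caller (for instance from any Hermitian eigenvalue routine applied to the SCM).

```python
def new_estimator_scores(eigs, m, beta=2):
    """Score for each candidate k = 0, ..., min(n, m) - 1.

    eigs: sample eigenvalues of the SCM (any order); m: number of snapshots;
    beta: 1 for real-valued data, 2 for complex-valued data.
    """
    l = sorted(eigs, reverse=True)
    n = len(l)
    c = n / m
    scores = []
    for k in range(min(n, m)):
        tail = l[k:]                       # candidate "noise" eigenvalues
        s1 = sum(tail)
        s2 = sum(x * x for x in tail)
        t_k = ((n - k) * s2 / (s1 * s1) - (1.0 + c)) * n - (2.0 / beta - 1.0) * c
        scores.append(beta / 4.0 * (m / n) ** 2 * t_k ** 2 + 2.0 * (k + 1))
    return scores

def estimate_num_signals(eigs, m, beta=2):
    """Return the k that minimizes the score."""
    scores = new_estimator_scores(eigs, m, beta)
    return min(range(len(scores)), key=scores.__getitem__)
```

For example, with the (made-up) eigenvalues `[10.0, 1.8, 1.0, 0.2]`, `m = 8` and `beta = 2`, the score is smallest at `k = 1`: once the dominant eigenvalue is removed, the normalized second moment of the remaining eigenvalues is close to $1 + n/m = 1.5$.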

To illustrate this effect, consider the situation where there are two uncorrelated (hence, independent) signals so 
that $R_s = \mathrm{diag}(\sigma^2_{S1}, \sigma^2_{S2})$. In (1), let $A = [v_1 \; v_2]$, as in a sensor array processing application so that $v_1 \equiv v(\theta_1)$ and
$v_2 \equiv v(\theta_2)$ encode the array manifold vectors for a source and an interferer with powers $\sigma^2_{S1}$ and $\sigma^2_{S2}$, located at
$\theta_1$ and $\theta_2$, respectively. The covariance matrix is then given by

$$R = \sigma^2_{S1} v_1 v_1' + \sigma^2_{S2} v_2 v_2' + \sigma^2 I. \tag{10}$$

In the special situation when $\| v_1 \| = \| v_2 \| = \| v \|$ and $\sigma^2_{S1} = \sigma^2_{S2} = \sigma^2_S$, we can (in an asymptotic sense) reliably
detect the presence of both signals from the sample eigenvalues alone whenever

$$\text{Asymptotic identifiability condition}: \quad \sigma^2_S \| v \|^2 \left( 1 - \frac{| \langle v_1, v_2 \rangle |}{\| v \|^2} \right) > \sigma^2 \sqrt{\frac{n}{m}}. \tag{11}$$

If the signals are not strong enough or not spaced far enough apart, then not only will the proposed estimator consistently
underestimate the number of signals but so will any other sample eigenvalue based detector.
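
A short sketch of how such a check might look in practice. It assumes the equal-norm, equal-power case above, for which the two nonzero eigenvalues of the signal covariance are $\sigma_S^2 (\|v\|^2 \pm |\langle v_1, v_2 \rangle|)$, and it tests whether the smaller one exceeds $\sigma^2 \sqrt{n/m}$. The function name and argument names are hypothetical.

```python
import math

def two_source_identifiable(v1, v2, sig_power, noise_var, m):
    """Check the asymptotic identifiability condition for two equal-power sources.

    v1, v2: equal-norm array manifold vectors (sequences of real or complex numbers);
    sig_power: common signal power; noise_var: noise variance; m: number of snapshots.
    """
    n = len(v1)
    norm_sq = sum(abs(x) ** 2 for x in v1)                      # ||v||^2
    inner = abs(sum(x.conjugate() * y for x, y in zip(v1, v2)))  # |<v1, v2>|
    weaker = sig_power * (norm_sq - inner)   # smaller nonzero "signal" eigenvalue
    return weaker > noise_var * math.sqrt(n / m)
```

Orthogonal manifold vectors give the most favourable case, while identical vectors (zero separation) can never satisfy the condition since the weaker eigenvalue collapses to zero.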
The concept of the effective number of signals provides insight into the fundamental limit, due to snapshot
constraints in high-dimensional settings, of reliable signal detection by eigen-inference, i.e., by using the sample
eigenvalues alone. This helps identify scenarios where algorithms that exploit any structure in the eigenvectors 
of the signals, such as the MUSIC and the Capon-MVDR algorithms in sensor array processing [22] or particle 
filter based techniques [30], might be better able to tease out lower level signals from the background noise. It 
is worth noting that the proposed approach remains relevant in situations where the eigenvector structure has been
identified. This is because eigen-inference methodologies are inherently robust to eigenvector modelling errors that 
occur in high-dimensional settings. Thus the practitioner may use the proposed methodologies to complement and 
"robustify" the inference provided by algorithms that exploit the eigenvector structure. 

III. Pertinent results from random matrix theory 

Analytically characterizing the distribution of the sample eigenvalues, as a function of the population eigenvalues, 
is the first step in designing a sample eigenvalue based estimator that is robust to high-dimensionality and sample 
size constraints. For arbitrary covariance $R$, the joint density function of the eigenvalues $l_1, \ldots, l_n$ of the SCM $\widehat{R}$
when $m > n + 1$ is shown to be given by [28]

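$$f(l_1, \ldots, l_n) = Z^{\beta}_{n,m} \prod_{i=1}^{n} l_i^{\beta(m-n+1)/2 - 1} \prod_{i<j} | l_i - l_j |^{\beta} \int_{Q} \exp\left( -\frac{m \beta}{2} \mathrm{Tr}\left( R^{-1} Q \widehat{R} Q' \right) \right) dQ \tag{12}$$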

where $l_1 > \ldots > l_n > 0$, $Z^{\beta}_{n,m}$ is a normalization constant, and $\beta = 1$ (or 2) when $\widehat{R}$ is real (resp. complex). In
(12), $Q \in O(n)$ when $\beta = 1$ while $Q \in U(n)$ when $\beta = 2$ where $O(n)$ and $U(n)$ are, respectively, the set of
$n \times n$ orthogonal and unitary matrices with Haar measure.

Note that the exact characterization of the joint density of the eigenvalues in (12) involves a multidimensional
integral over the orthogonal (or unitary) group. This makes it intractable for analysis without resorting to asymptotics. 
Anderson's landmark paper [28] does just that by characterizing the distribution of the sample eigenvalues using 
large sample asymptotics. A classical result due to Anderson establishes the consistency of the sample eigenvalues 
in the dimensionality $n$ fixed, sample size $m \to \infty$ asymptotic regime [28]. When the dimensionality is small and
there are plenty of samples available, Anderson's analysis suggests that the sample eigenvalues will be (roughly) 
symmetrically centered around the population eigenvalues. When the dimensionality is large, and the sample size 
is relatively small, Anderson's prediction of sample eigenvalue consistency is in stark contrast to the asymmetric 
spreading of the sample eigenvalues that is observed in numerical simulations. This is illustrated in Figure 1 where
the n = 20 eigenvalues of a SCM formed from m = 20 samples are compared with the eigenvalues of the underlying 
population covariance matrix. 

The role of random matrix theory comes in because of new analytical results that are able to precisely describe the 
spreading of the sample eigenvalues exhibited in Figure 1. Since our new estimator explicitly exploits these analytical
results, the use of our estimator in high-dimensional, sample starved settings is more mathematically justified than
other sample eigenvalue based approaches found in the literature that explicitly use Anderson's sample eigenvalue
consistency results. See, for example, [2, Eq. (13a), p. 389], [6, Eq. (5), p. 2243].

We argue that this is particularly so in settings where m < n, where practitioners (though, not the original 
authors!) have often invoked the equality between the non-zero eigenvalues of $\widehat{R} = (1/m) X X'$ and the matrix
$(1/m) X' X$ to justify ad-hoc modifications to estimators that use only the non-zero eigenvalues of the SCM, as
in (8). We contend that such an ad-hoc modification to any estimator that explicitly uses Anderson's sample
eigenvalue consistency results is mathematically unjustified because the sample eigen-spectrum blurring, which is 
only exacerbated in the m < n regime, remains unaccounted for. 

Before summarizing the pertinent results, we note that the analytical breakthrough is a consequence of considering 
the large system size, relatively large sample size asymptotic regime as opposed to "classical" fixed system size, 
large sample size asymptotic regime. Mathematically speaking, the new results describe the distribution of the 
eigenvalues in the $n, m \to \infty$ with $n/m \to c \in (0, \infty)$ asymptotic regime as opposed to the $n$ fixed, $m \to \infty$ regime
à la Anderson. We direct the reader to Johnstone's excellent survey for a discussion on these asymptotic regimes
[31, p. 9] and much more.
(b) Sample eigenvalues formed from 20 snapshots.

Fig. 1: Blurring of sample eigenvalues relative to the population eigenvalues when there are a finite number of
snapshots.



A. Eigenvalues of the signal-free SCM 

A central object in the study of large random matrices is the empirical distribution function (e.d.f.) of the 
eigenvalues, which for an arbitrary matrix A with n real eigenvalues (counted with multiplicity), is defined as 

$$F^{A}(x) = \frac{\text{Number of eigenvalues of } A \le x}{n}. \tag{13}$$

For a broad class of random matrices, the sequence of e.d.f.'s can be shown to converge in the $n \to \infty$ limit to a
non-random distribution function [32]. Of particular interest is the convergence of the e.d.f. of the signal-free SCM 
which is described next. 

Proposition 3.1: Let $\widehat{R}$ denote a signal-free sample covariance matrix formed from an $n \times m$ matrix of
observations with i.i.d. Gaussian samples of mean zero and variance $\lambda = \sigma^2$. Then the e.d.f. $F^{\widehat{R}}(x) \to F^{W}(x)$ almost
surely for every $x$, as $m, n \to \infty$ and $c_m = n/m \to c$ where

$$dF^{W}(x) = \max\left( 0, 1 - \frac{1}{c} \right) \delta(x) + \frac{\sqrt{(x - a_-)(a_+ - x)}}{2 \pi \lambda x c} \, I_{[a_-, a_+]}(x) \, dx, \tag{14}$$

with $a_{\pm} = \lambda (1 \pm \sqrt{c})^2$, $I_{[a,b]}(x) = 1$ when $a \le x \le b$ and zero otherwise, and $\delta(x)$ is the Dirac delta function.

Proof: This result was proved in [33], [34] in very general settings. Other proofs include [9], [35], [36]. The
probability density in (14) is often referred to as the Marčenko-Pastur density. ■

Figure 2 plots the Marčenko-Pastur density in (14) for $\lambda = 1$ and different values of $c = \lim n/m$. Note that as
$c \to 0$, the eigenvalues are increasingly clustered around $x = 1$ but for modest values of $c$, the spreading is quite
significant. 
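
To make the shape of this density concrete, here is a small sketch that evaluates its continuous part and numerically integrates it with a midpoint rule. It assumes the standard form $\sqrt{(x - a_-)(a_+ - x)} / (2 \pi \lambda x c)$ on $[a_-, a_+]$ with $a_\pm = \lambda(1 \pm \sqrt{c})^2$; the function names and the step count are illustrative choices, not part of the original text.

```python
import math

def mp_density(x, c, lam=1.0):
    """Continuous part of the Marcenko-Pastur density at x for ratio c = lim n/m."""
    a_minus = lam * (1.0 - math.sqrt(c)) ** 2
    a_plus = lam * (1.0 + math.sqrt(c)) ** 2
    if x <= a_minus or x >= a_plus:
        return 0.0
    return math.sqrt((x - a_minus) * (a_plus - x)) / (2.0 * math.pi * lam * x * c)

def mp_continuous_mass(c, lam=1.0, steps=20000):
    """Midpoint-rule integral of the continuous part over its support."""
    a_minus = lam * (1.0 - math.sqrt(c)) ** 2
    a_plus = lam * (1.0 + math.sqrt(c)) ** 2
    h = (a_plus - a_minus) / steps
    return h * sum(mp_density(a_minus + (i + 0.5) * h, c, lam) for i in range(steps))
```

When $c < 1$ the continuous part carries all of the probability mass; when $c > 1$ it carries mass $1/c$, with the remaining $1 - 1/c$ sitting in the atom at zero that corresponds to the zero eigenvalues of a singular SCM.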

The almost sure convergence of the e.d.f. of the signal-free SCM implies that the moments of the eigenvalues 
converge almost surely, so that 

$$\frac{1}{n} \sum_{i=1}^{n} l_i^k \xrightarrow{a.s.} \int x^k \, dF^{W}(x) =: M_k^{W}. \tag{15}$$

Fig. 2: The Marčenko-Pastur density, given by (14), for the eigenvalues of the signal-free sample covariance matrix
with noise variance 1 and c = lim n/m. 



The moments of the Marčenko-Pastur density are given by [9], [37]

$$M_k^{W} = \lambda^k \sum_{j=0}^{k-1} c^j \, \frac{1}{j+1} \binom{k}{j} \binom{k-1}{j}. \tag{16}$$
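
As a quick sanity check on the moment formula, the sketch below evaluates $M_k^W = \lambda^k \sum_{j=0}^{k-1} c^j \frac{1}{j+1} \binom{k}{j} \binom{k-1}{j}$ for small $k$. The function name `mp_moment` is a hypothetical helper.

```python
from math import comb

def mp_moment(k, c, lam=1.0):
    """k-th moment of the Marcenko-Pastur law with ratio c and noise variance lam."""
    return lam ** k * sum(
        c ** j * comb(k, j) * comb(k - 1, j) / (j + 1) for j in range(k)
    )
```

The first two moments reduce to $\lambda$ and $\lambda^2 (1 + c)$, which are exactly the centering constants that appear in the fluctuation result for the sample moments.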



For finite $n$ and $m$, the sample moments, i.e., $\frac{1}{n} \sum_{i=1}^{n} l_i^k$, will fluctuate about these limiting values. The precise
nature of the fluctuations is described next. 

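Proposition 3.2: If $\widehat{R}$ satisfies the hypotheses of Proposition 3.1 for some $\lambda$ then as $m, n \to \infty$ and
$c_m = n/m \to c \in (0, \infty)$,

$$n \begin{bmatrix} \frac{1}{n} \sum_{i=1}^{n} l_i - \lambda \\ \frac{1}{n} \sum_{i=1}^{n} l_i^2 - \lambda^2 (1 + c) \end{bmatrix} \xrightarrow{D} \mathcal{N}\left( \underbrace{\left( \frac{2}{\beta} - 1 \right) \begin{bmatrix} 0 \\ \lambda^2 c \end{bmatrix}}_{=: \mu_Q}, \; \underbrace{\frac{2}{\beta} \begin{bmatrix} \lambda^2 c & 2 \lambda^3 c (c+1) \\ 2 \lambda^3 c (c+1) & 2 \lambda^4 c (2c^2 + 5c + 2) \end{bmatrix}}_{=: Q} \right) \tag{17}$$

where the convergence is in distribution.

Proof: This result appears in [8], [9] for the real case and in [10] for the real and complex cases. The result
for general $\beta$ appears in Dumitriu and Edelman [11]. ■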

We now use the result in Proposition 3.2 to develop a test statistic $q_n$ whose distribution is independent of the
unknown noise variance $\lambda$. The distributional properties of this test statistic are described next.

Proposition 3.3: Assume $\widehat{R}$ satisfies the hypotheses of Proposition 3.1 for some $\lambda$. Consider the statistic

$$q_n = \frac{\frac{1}{n} \sum_{i=1}^{n} l_i^2}{\left( \frac{1}{n} \sum_{i=1}^{n} l_i \right)^2}.$$

Then as $m, n \to \infty$ and $c_m = n/m \to c \in (0, \infty)$,

$$n \left[ q_n - (1 + c) \right] \xrightarrow{D} \mathcal{N}\left( \left( \frac{2}{\beta} - 1 \right) c, \; \frac{4}{\beta} c^2 \right) \tag{18}$$

where the convergence is in distribution.

Proof: Define the function $g(x, y) = y/x^2$. Its gradient vector $\nabla g(x, y)$ is given by

$$\nabla g(x, y) := \left[ \partial_x g(x, y) \;\; \partial_y g(x, y) \right]^T = \left[ -2y/x^3 \;\; 1/x^2 \right]^T. \tag{19}$$

The statistic $q_n$ can be written in terms of $g(x, y)$ as simply $q_n = g\left( \frac{1}{n} \sum_{i=1}^{n} l_i, \frac{1}{n} \sum_{i=1}^{n} l_i^2 \right)$. The limiting distribution
of $q_n$ can be deduced from the distributional properties of $\frac{1}{n} \sum_{i=1}^{n} l_i$ and $\frac{1}{n} \sum_{i=1}^{n} l_i^2$ established in Proposition 3.2.
Specifically, by an application of the delta method [38], we obtain that as $n, m \to \infty$ with $n/m \to c \in (0, \infty)$,

$$n \left[ q_n - g(\lambda, \lambda^2 (1 + c)) \right] \xrightarrow{D} \mathcal{N}(\mu_q, \sigma_q^2)$$

where the mean $\mu_q$ and the variance $\sigma_q^2$ are given by

$$\mu_q = \mu_Q^T \, \nabla g(\lambda, \lambda^2 (1 + c)), \tag{20a}$$

$$\sigma_q^2 = \nabla g(\lambda, \lambda^2 (1 + c))^T \, Q \, \nabla g(\lambda, \lambda^2 (1 + c)). \tag{20b}$$

Substituting the expressions for $\mu_Q$ and $Q$ given in (17) into (20a) and (20b) gives us the required expressions for
the mean and the variance of the normal distribution on the right hand side of (18). ■
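
The substitution in the last step of the proof can be checked mechanically. The sketch below uses exact rational arithmetic and assumes the mean vector $(2/\beta - 1)[0, \lambda^2 c]^T$ and the covariance $(2/\beta)\begin{bmatrix} \lambda^2 c & 2\lambda^3 c(c+1) \\ 2\lambda^3 c(c+1) & 2\lambda^4 c(2c^2+5c+2) \end{bmatrix}$ of the joint limit of the first two sample moments; the function name is hypothetical.

```python
from fractions import Fraction

def delta_method_limit(c, lam, beta):
    """Mean and variance of the limiting normal law of n[q_n - (1 + c)]."""
    c, lam, beta = Fraction(c), Fraction(lam), Fraction(beta)
    scale = 2 / beta
    mu_Q = [Fraction(0), (scale - 1) * lam**2 * c]
    Q = [
        [scale * lam**2 * c, scale * 2 * lam**3 * c * (c + 1)],
        [scale * 2 * lam**3 * c * (c + 1),
         scale * 2 * lam**4 * c * (2 * c**2 + 5 * c + 2)],
    ]
    # Gradient of g(x, y) = y / x^2 evaluated at (lam, lam^2 (1 + c)).
    grad = [-2 * (1 + c) / lam, 1 / lam**2]
    mean = sum(g * mu for g, mu in zip(grad, mu_Q))
    var = sum(grad[i] * Q[i][j] * grad[j] for i in range(2) for j in range(2))
    return mean, var
```

Whatever value of $\lambda$ is passed in, the result depends only on $c$ and $\beta$: the mean is $(2/\beta - 1) c$ and the variance is $(4/\beta) c^2$, which is what makes the statistic usable when the noise variance is unknown.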



B. Eigenvalues of the signal bearing SCM 

When there are $k$ signals present then, in the $n \to \infty$ limit, where $k$ is kept fixed, the limiting e.d.f. of $\widehat{R}$ will
still be given by Proposition 3.1. This is because the e.d.f., defined as in (13), weights the contribution of every
eigenvalue equally so that the effect of the $k/n$ fraction of "signal" eigenvalues vanishes in the $n \to \infty$ limit.

Note, however, that in the signal-free case, i.e., when $k = 0$, Proposition 3.1 and the result in [39] establish the
almost sure convergence of the largest eigenvalue of the SCM to $\lambda (1 + \sqrt{c})^2$. In the signal bearing case, a so-called
phase transition phenomenon is observed, in that the largest eigenvalue will converge to a limit different from that 
in the signal-free case only if the "signal" eigenvalues are above a certain threshold. This is described next. 

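Proposition 3.4: Let $\widehat{R}$ denote a sample covariance matrix formed from an $n \times m$ matrix of Gaussian observations
whose columns are independent of each other and identically distributed with mean 0 and covariance $R$. Denote
the eigenvalues of $R$ by $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_k > \lambda_{k+1} = \ldots = \lambda_n = \lambda$. Let $l_j$ denote the $j$-th largest eigenvalue of $\widehat{R}$.
Then as $n, m \to \infty$ with $c_m = n/m \to c \in (0, \infty)$,

$$l_j \to \begin{cases} \lambda_j \left( 1 + \dfrac{\lambda c}{\lambda_j - \lambda} \right) & \text{if } \lambda_j > \lambda (1 + \sqrt{c}) \\[2ex] \lambda (1 + \sqrt{c})^2 & \text{if } \lambda_j \le \lambda (1 + \sqrt{c}) \end{cases} \tag{21}$$

for $j = 1, \ldots, k$ and the convergence is almost sure.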

Proof: This result appears in [12] for very general settings. A matrix theoretic proof for the real valued SCM 
case may be found in [13] while a determinantal proof for the complex case may be found in [14]. A heuristic
derivation that relies on an interacting particle system interpretation of the sample eigenvalues appears in [15]. ■ 
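
The phase transition can be tabulated with a few lines of code. This sketch assumes the limit $\lambda_j (1 + \lambda c / (\lambda_j - \lambda))$ above the threshold $\lambda (1 + \sqrt{c})$ and $\lambda (1 + \sqrt{c})^2$ at or below it, and counts the population eigenvalues above the threshold as the effective number of identifiable signals; both function names are hypothetical.

```python
import math

def limiting_sample_eig(lam_j, lam, c):
    """Almost sure limit of the sample eigenvalue associated with lam_j."""
    if lam_j > lam * (1.0 + math.sqrt(c)):
        return lam_j * (1.0 + lam * c / (lam_j - lam))
    return lam * (1.0 + math.sqrt(c)) ** 2

def effective_num_signals(signal_eigs, lam, c):
    """Number of population "signal" eigenvalues strictly above the threshold."""
    threshold = lam * (1.0 + math.sqrt(c))
    return sum(1 for x in signal_eigs if x > threshold)
```

With $\lambda = 1$ and $c = 1$ (as many snapshots as sensors) the threshold is 2, so a population eigenvalue of 1.5 produces the same limiting sample eigenvalue, $(1 + \sqrt{c})^2 = 4$, as pure noise would, and it is not counted among the identifiable signals.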
For "signal" eigenvalues above the threshold described in Proposition 3.4, the fluctuations about the asymptotic
limit are described next. 

Proposition 3.5: Assume that $R$ and $\widehat{R}$ satisfy the hypotheses of Proposition 3.4. If $\lambda_j > \lambda (1 + \sqrt{c})$ has
multiplicity 1 and if $\sqrt{m} \, | c - n/m | \to 0$, then



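$$\sqrt{n} \left[ l_j - \lambda_j \left( 1 + \frac{\lambda c}{\lambda_j - \lambda} \right) \right] \xrightarrow{D} \mathcal{N}\left( 0, \; \frac{2}{\beta} \lambda_j^2 \left( 1 - \frac{c}{(\lambda_j - \lambda)^2} \right) \right) \tag{22}$$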



where the convergence is in distribution.

Proof: A matrix theoretic proof for the real case may be found in [13] while a determinantal proof for the
complex case may be found in [14]. The result has been strengthened for non-Gaussian situations by Baik and
Silverstein for general $c \in (0, \infty)$ [40]. ■
IV. Estimating the number of signals

We derive an information theoretic estimator for the number of signals by exploiting the distributional properties of
the moments of eigenvalues of the (signal-free) SCM given by Propositions 3.2 and 3.3 as follows. The overarching
principle used is that, given an observation $y = [y(1), \ldots, y(N)]$ and a family of models, or equivalently a
parameterized family of probability densities $f(y | \theta)$ indexed by the parameter vector $\theta$, we select the model which
gives the minimum Akaike Information Criterion (AIC) [19] defined by

$$\mathrm{AIC}_k = -2 \log f(y | \hat{\theta}) + 2k \tag{23}$$

where $\hat{\theta}$ is the maximum likelihood estimate of $\theta$, and $k$ is the number of free parameters in $\theta$. Since the noise
variance is unknown, the parameter vector of the model, denoted by $\theta_k$, is given by

$$\theta_k = [\lambda_1, \ldots, \lambda_k, \sigma^2]^T. \tag{24}$$



There are thus $k + 1$ free parameters in $\theta_k$. Assuming that there are $k < \min(n, m)$ signals, the maximum likelihood
estimate of the noise variance is given by [28] (which Proposition 3.2 corroborates in the $m < n$ setting)

$$\hat{\sigma}^2_{(k)} = \frac{1}{n-k} \sum_{i=k+1}^{n} l_i \tag{25}$$

where $l_1 \ge \ldots \ge l_n$ are the eigenvalues of $\widehat{R}$. Consider the test statistic



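$$q_k = n \left[ \frac{\frac{1}{n-k} \sum_{i=k+1}^{n} l_i^2}{\left( \hat{\sigma}^2_{(k)} \right)^2} - (1 + c) \right] - \left( \frac{2}{\beta} - 1 \right) c \tag{26}$$

$$\phantom{q_k} = n \left[ \frac{\frac{1}{n-k} \sum_{i=k+1}^{n} l_i^2}{\left( \frac{1}{n-k} \sum_{i=k+1}^{n} l_i \right)^2} - (1 + c) \right] - \left( \frac{2}{\beta} - 1 \right) c \tag{27}$$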



for a constant $c > 0$. When $k > 0$ signals are present and assuming $k \ll n$, then the distributional properties of
the $n - k$ "noise" eigenvalues are closely approximated by the distributional properties of the eigenvalues given by
Proposition 3.2 of the signal-free SCM, i.e., when $k = 0$. It is hence reasonable to approximate the distribution of
the statistic $q_k$ with the normal distribution whose mean and variance, for some $c > 0$, are given in Proposition 3.3.
The log-likelihood function $\log f(q_k | \theta)$, for large $n, m$ can hence be approximated by

$$-\log f(q_k | \theta) \approx \frac{q_k^2}{2 \, \frac{4}{\beta} c^2} + \underbrace{\frac{1}{2} \log 2 \pi \frac{4}{\beta} c^2}_{\text{Constant}}. \tag{28}$$



In (26) and (28), it is reasonable (Bai and Silverstein provide an argument in [10]) to use $c_m = n/m$ for the
(unknown) limiting parameter $c = \lim n/m$. Plugging in $c \approx c_m = n/m$ into (26) and (28), and ignoring the constant
term on the right hand side of (28) when the log-likelihood function is substituted into (23), yields the estimator in
(9). Figure 3 plots sample realizations of the score function.
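
For readers who want the intermediate step, the substitution can be written out as follows. This is a sketch that assumes the normal approximation for $q_k$ with zero mean and variance $(4/\beta) c^2$, and counts $k + 1$ free parameters; $t_k$ denotes $q_k$ evaluated at $c = n/m$.

```latex
\begin{align*}
\mathrm{AIC}_k &= -2 \log f(q_k \mid \hat{\theta}_k) + 2(k+1) \\
               &\approx \frac{\beta}{4 c^2} \, q_k^2
                  + \log\!\Big( 2\pi \, \frac{4}{\beta} \, c^2 \Big) + 2(k+1), \\
\mathrm{AIC}_k \Big|_{c = n/m}
               &\approx \frac{\beta}{4} \Big[ \frac{m}{n} \Big]^2 t_k^2 + 2(k+1)
                  + \text{constant}.
\end{align*}
```

The constant does not depend on $k$, so dropping it does not change the minimizer.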
(a) Complex signals: $n = 16$, $m = 32$.

(b) Complex signals: $n = 32$, $m = 64$.

Fig. 3: Sample realizations of the proposed criterion when there are $k = 2$ complex valued signals and $\lambda_1 = 10$, $\lambda_2 = 3$
and $\lambda_3 = \ldots = \lambda_n = 1$.



V. Extension to frequency domain and vector sensors 

When the $m$ snapshot vectors $x_i(w_j)$ for $i = 1, \ldots, m$ represent Fourier coefficient vectors at frequency $w_j$
then the sample covariance matrix

$$\widehat{R}(w_j) = \frac{1}{m} \sum_{i=1}^{m} x_i(w_j) \, x_i(w_j)' \tag{29}$$

is the periodogram estimate of the spectral density matrix at frequency $w_j$. The time-domain approach carries over
to the frequency domain so that the estimator in (9) remains applicable with $l_i \equiv l_i(w_j)$ where $l_1(w_j) \ge l_2(w_j) \ge
\ldots \ge l_n(w_j)$ are the eigenvalues of $\widehat{R}(w_j)$.

When the signals are wideband and occupy $M$ frequency bins, denoted by $w_1, \ldots, w_M$, then the information on
the number of signals present is contained in all the bins. The assumption that the observation time is much larger 
than the correlation times of the signals (sometimes referred to as the SPLOT assumption - stationary process, long 
observation time) ensures that the Fourier coefficients corresponding to the different frequencies are statistically 
independent. 

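Thus the AIC based criterion for detecting the number of wideband signals that occupy the frequency bands
$\omega_1, \ldots, \omega_M$ is obtained by summing the corresponding criterion in (9) over the frequency range of interest:

$$t_{j,k} = \left[ (n-k) \frac{\sum_{i=k+1}^{n} l_i(\omega_j)^2}{\left( \sum_{i=k+1}^{n} l_i(\omega_j) \right)^2} - \left( 1 + \frac{n}{m} \right) \right] n - \left( \frac{2}{\beta} - 1 \right) \frac{n}{m} \tag{30a}$$

$$\hat{k}_{\mathrm{NEW}} = \arg\min_{k \in \mathbb{N}:\, 0 \leq k < \min(n,m)} \; \sum_{j=1}^{M} \frac{\beta}{4} \left[ \frac{m}{n} \right]^2 t_{j,k}^2 + 2M(k+1) \tag{30b}$$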
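The summation over frequency bins can be sketched as follows. This is a minimal illustration that assumes the per-bin statistic has the same form as the narrowband statistic in (9), applied to the eigenvalues of each $\hat{R}(\omega_j)$; the function name `wideband_scores` and the array layout (one row of eigenvalues per frequency bin) are hypothetical conventions introduced here.

```python
import numpy as np

def wideband_scores(eigs_per_bin, m, beta=2):
    """Summed score over M frequency bins for k = 0, ..., min(n, m) - 1.

    eigs_per_bin: array of shape (M, n); row j holds the eigenvalues of the
    sample covariance (periodogram) matrix at frequency bin j.
    """
    L = np.sort(np.asarray(eigs_per_bin, dtype=float), axis=1)[:, ::-1]
    M, n = L.shape
    scores = np.empty(min(n, m))
    for k in range(scores.size):
        tail = L[:, k:]  # n - k smallest eigenvalues in every bin
        ratio = (n - k) * np.sum(tail**2, axis=1) / np.sum(tail, axis=1) ** 2
        t_jk = (ratio - (1 + n / m)) * n - (2 / beta - 1) * n / m
        # Sum the per-bin likelihood terms; the penalty is scaled by M.
        scores[k] = np.sum(beta / 4 * (m / n) ** 2 * t_jk**2) + 2 * M * (k + 1)
    return scores

def wideband_estimate(eigs_per_bin, m, beta=2):
    """Candidate k that minimizes the summed score."""
    return int(np.argmin(wideband_scores(eigs_per_bin, m, beta)))
```

With a single bin ($M = 1$) this reduces to the narrowband score, and stacking $M$ identical bins simply multiplies every score by $M$, so the minimizer is unchanged.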
When the number of snapshots is severely constrained, the SPLOT assumption is likely to be violated so that
the Fourier coefficients corresponding to different frequencies will not be statistically independent. This will likely
degrade the performance of the proposed estimators.

When the measurement vectors represent quaternion valued narrowband signals, then $\beta = 4$ so that the estimator
in (9) can be used. Quaternion valued vectors arise when the data collected from vector sensors is represented using
quaternions as in [41].

VI. Consistency of the estimator and the effective number of identifiable signals 

For a fixed sample size, and system dimensionality, the probability of detecting a signal is the most practically 
useful criterion for comparing the performance of different estimators. For theoretical purposes, however, the 
large sample consistency of the estimator is (usually) more analytically tractable and hence often supplied as 
the justification for employing an estimator. We conjecture that the proposed algorithm is a consistent estimator of 
the true number of signals in the "classical" large sample asymptotic regime in the sense made explicit next. 



Conjecture 6.1: Let $R$ be an $n \times n$ covariance matrix that satisfies the hypothesis of Proposition 3.4. Let $\hat{R}$ be a
sample covariance matrix formed from $m$ snapshots. Then in the $n$ fixed, $m \to \infty$ limit, $\hat{k}$ is a consistent estimator
of $k$, where $\hat{k}$ is the estimate of the number of signals obtained using the algorithm in (9).



The "classical" notion of large sample consistency does not adequately capture the suitability of an estimator in high
dimensional, sample starved settings when $m < n$ or $m = O(n)$. In such settings, it is more natural to investigate
the consistency properties of the estimator in the large system, large sample limit instead. We can use Proposition
3.4 to establish an important property of the proposed estimator in such a limit.

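Theorem 6.2: Let $R$ and $\tilde{R}$ be two $n \times n$ sized covariance matrices whose eigenvalues are related as

$$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_p, \lambda_{p+1}, \ldots, \lambda_k, \lambda, \ldots, \lambda) \tag{31a}$$

$$\tilde{\Lambda} = \mathrm{diag}(\lambda_1, \ldots, \lambda_p, \lambda, \ldots, \lambda) \tag{31b}$$

where for some $c \in (0, \infty)$, and all $i = p+1, \ldots, k$, $\lambda < \lambda_i \leq \lambda(1 + \sqrt{c})$. Let $\hat{R}$ and $\hat{\tilde{R}}$ be the associated sample
covariance matrices formed from $m$ snapshots. Then for every $n, m(n) \to \infty$ such that $c_m = n/m \to c$,

$$\mathrm{Prob}(\hat{k} = j \mid R) \to \mathrm{Prob}(\hat{k} = j \mid \tilde{R}) \quad \text{for } j = 1, \ldots, p \tag{32a}$$

and

$$\mathrm{Prob}(\hat{k} > p \mid R) \to \mathrm{Prob}(\hat{k} > p \mid \tilde{R}) \tag{32b}$$

where the convergence is almost sure and $\hat{k}$ is the estimate of the number of signals obtained using the algorithm
in (9).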

Proof: The result follows from Proposition 3.4. The almost sure convergence of the sample eigenvalues
$l_j \to \lambda(1 + \sqrt{c})^2$ for $j = p+1, \ldots, k$ implies that the $i$-th largest eigenvalues of $\hat{R}$ and $\hat{\tilde{R}}$, for $i = 1, \ldots, p+1$,
converge to the same limit almost surely. The fluctuations about this limit will hence be identical so that (32)
follows in the asymptotic limit. ■

Note that the rate of convergence to the asymptotic limit for $\mathrm{Prob}(\hat{k} > p \mid R)$ and $\mathrm{Prob}(\hat{k} > p \mid \tilde{R})$ will,
in general, depend on the eigenvalue structure of $R$ and may be arbitrarily slow. Thus, Theorem 6.2 yields
no insight into rate of convergence type issues which are important in practice. Rather, the theorem is a statement
on the asymptotic equivalence, from an identifiability point of view, of sequences of sample covariance matrices
which are related in the manner described. At this point, we are unable to prove the consistency of the proposed
estimator as this would require a more refined analysis that characterizes the fluctuations of subsets of the (ordered)
"noise" eigenvalues. The statement regarding consistency of the proposed estimator, in the sense of large system,
large sample limit, is presented as a conjecture with numerical simulations used as non-definitive yet corroborating
evidence.
Conjecture 6.3: Let $R$ be an $n \times n$ covariance matrix that satisfies the hypothesis of Proposition 3.4. Let $\hat{R}$ be a
sample covariance matrix formed from $m$ snapshots. Define

$$k_{\mathrm{eff}}(c \mid R) := \text{Number of eigenvalues of } R > \lambda(1 + \sqrt{c}). \tag{33}$$

Then in the $m, n \to \infty$ limit with $c_m = n/m \to c$, $\hat{k}$ is a consistent estimator of $k_{\mathrm{eff}}(c)$, where $\hat{k}$ is the estimate of
the number of signals obtained using the algorithm in (9).



Motivated by Proposition 3.4 we (heuristically) define the effective number of (identifiable) signals as

$$k_{\mathrm{eff}}(R) = \#\ \text{eigs. of } R > \sigma^2 \left( 1 + \sqrt{\frac{n}{m}} \right). \tag{34}$$

Conjecture 6.3 then simply states that the proposed estimator is a consistent estimator of the effective number of
(identifiable) signals in the large system, large sample limit.
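The definition in (34) amounts to counting the population eigenvalues that exceed the threshold $\sigma^2(1 + \sqrt{n/m})$. A minimal sketch follows, using the eigenvalue configuration of the simulations in this paper ($\lambda_1 = 10$, $\lambda_2 = 3$, $\sigma^2 = 1$) as the example; the function name `effective_num_signals` is a hypothetical label chosen for illustration.

```python
import numpy as np

def effective_num_signals(pop_eigs, n, m, sigma2=1.0):
    """Count population eigenvalues strictly above sigma2 * (1 + sqrt(n/m))."""
    threshold = sigma2 * (1 + np.sqrt(n / m))
    return int(np.sum(np.asarray(pop_eigs, dtype=float) > threshold))

# Two signal eigenvalues (10 and 3) and n - 2 unit noise eigenvalues.
n = 32
pop_eigs = np.array([10.0, 3.0] + [1.0] * (n - 2))
k_eff_4n = effective_num_signals(pop_eigs, n, m=4 * n)       # threshold 1.5
k_eff_quarter = effective_num_signals(pop_eigs, n, m=n // 4)  # threshold 3.0
```

For $m = 4n$ the threshold is $1.5$ and both signal eigenvalues count; for $m = n/4$ the threshold is exactly $3$, so $\lambda_2 = 3$ is not strictly above it and only one signal is counted.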



A. The asymptotic identifiability of two closely spaced signals 

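Suppose there are two uncorrelated (hence, independent) signals so that $R_s = \mathrm{diag}(\sigma_{S1}^2, \sigma_{S2}^2)$. In (1) let $A =
[v_1\ v_2]$. In a sensor array processing application, we think of $v_1 = v(\theta_1)$ and $v_2 = v(\theta_2)$ as encoding the array
manifold vectors for a source and an interferer with powers $\sigma_{S1}^2$ and $\sigma_{S2}^2$, located at $\theta_1$ and $\theta_2$, respectively. The
covariance matrix given by

$$R = \sigma_{S1}^2 v_1 v_1' + \sigma_{S2}^2 v_2 v_2' + \sigma^2 I \tag{35}$$

has the $n - 2$ smallest eigenvalues $\lambda_3 = \ldots = \lambda_n = \sigma^2$ and the two largest eigenvalues

$$\lambda_1 = \sigma^2 + \frac{\sigma_{S1}^2 \|v_1\|^2 + \sigma_{S2}^2 \|v_2\|^2}{2} + \frac{\sqrt{\left( \sigma_{S1}^2 \|v_1\|^2 - \sigma_{S2}^2 \|v_2\|^2 \right)^2 + 4 \sigma_{S1}^2 \sigma_{S2}^2 |\langle v_1, v_2 \rangle|^2}}{2} \tag{36a}$$

$$\lambda_2 = \sigma^2 + \frac{\sigma_{S1}^2 \|v_1\|^2 + \sigma_{S2}^2 \|v_2\|^2}{2} - \frac{\sqrt{\left( \sigma_{S1}^2 \|v_1\|^2 - \sigma_{S2}^2 \|v_2\|^2 \right)^2 + 4 \sigma_{S1}^2 \sigma_{S2}^2 |\langle v_1, v_2 \rangle|^2}}{2} \tag{36b}$$

respectively. Applying the result in Proposition 3.4 allows us to express the effective number of signals as

$$k_{\mathrm{eff}} = \begin{cases} 2 & \text{if } \sigma^2 \left( 1 + \sqrt{\frac{n}{m}} \right) < \lambda_2, \\ 1 & \text{if } \lambda_2 \leq \sigma^2 \left( 1 + \sqrt{\frac{n}{m}} \right) < \lambda_1, \\ 0 & \text{if } \lambda_1 \leq \sigma^2 \left( 1 + \sqrt{\frac{n}{m}} \right). \end{cases} \tag{37}$$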

In the special situation when $\|v_1\| = \|v_2\| = \|v\|$ and $\sigma_{S1}^2 = \sigma_{S2}^2 = \sigma_S^2$, we can (in an asymptotic sense) reliably
detect the presence of both signals from the sample eigenvalues alone whenever

Asymptotic identifiability condition:

$$\sigma_S^2 \|v\|^2 \left( 1 - \frac{|\langle v_1, v_2 \rangle|}{\|v\|^2} \right) > \sigma^2 \sqrt{\frac{n}{m}} \tag{38}$$

Equation (38) captures the tradeoff between the identifiability of two closely spaced signals, the dimensionality of
the system, the number of available snapshots and the cosine of the angle between the vectors $v_1$ and $v_2$.
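The closed-form eigenvalues in (36) and the condition in (38) can be checked numerically. The sketch below is an illustration only: the helper names `two_signal_eigs` and `both_identifiable` are hypothetical, and `both_identifiable` assumes equal norms and equal signal powers, as in the special situation described above.

```python
import numpy as np

def two_signal_eigs(v1, v2, p1, p2, sigma2):
    """Two largest eigenvalues of p1*v1*v1' + p2*v2*v2' + sigma2*I, as in (36)."""
    a = p1 * np.vdot(v1, v1).real  # signal 1 power times ||v1||^2
    b = p2 * np.vdot(v2, v2).real  # signal 2 power times ||v2||^2
    root = np.sqrt((a - b) ** 2 + 4 * p1 * p2 * abs(np.vdot(v1, v2)) ** 2)
    return sigma2 + (a + b + root) / 2, sigma2 + (a + b - root) / 2

def both_identifiable(v1, v2, p, sigma2, m):
    """Condition (38); assumes ||v1|| = ||v2|| and equal signal powers p."""
    n = v1.size
    norm2 = np.vdot(v1, v1).real
    cosine = abs(np.vdot(v1, v2)) / norm2
    return bool(p * norm2 * (1 - cosine) > sigma2 * np.sqrt(n / m))

# Compare the closed form with a direct eigendecomposition.
rng = np.random.default_rng(1)
n = 6
v1 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v2 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
R = 2.0 * np.outer(v1, v1.conj()) + 0.5 * np.outer(v2, v2.conj()) + 1.5 * np.eye(n)
numeric = np.sort(np.linalg.eigvalsh(R))[::-1]
lam1, lam2 = two_signal_eigs(v1, v2, 2.0, 0.5, 1.5)
```

As the vectors become more nearly collinear, the cosine term approaches one, the left-hand side of (38) shrinks to zero, and the second signal becomes unidentifiable regardless of its power.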

We note that the concept of the effective number of signals is an asymptotic concept for large dimensions and
relatively large sample sizes. For moderate dimensions and sample sizes, the fluctuations in the "signal" and "noise"
eigenvalues affect the reliability of the underlying detection procedure as illustrated in Figure 4(b). From Proposition
3.4 we expect that the largest "noise" eigenvalue will, with high probability, be found in a neighborhood around
$\sigma^2 (1 + \sqrt{n/m})^2$ while the "signal" eigenvalues will, with high probability, be found in a neighborhood around
$\lambda_j \left( 1 + \frac{\sigma^2 n}{m(\lambda_j - \sigma^2)} \right)$. From Proposition 3.5 we expect the "signal" eigenvalues to exhibit Gaussian fluctuations with
a standard deviation of approximately $\sqrt{\frac{2}{\beta n} \lambda_j^2 \left( 1 - \frac{n}{m(\lambda_j - \sigma^2)^2} \right)}$. This motivates our definition of the metric $Z_j^{\mathrm{SEP}}$
given by

$$Z_j^{\mathrm{SEP}} := \frac{\lambda_j \left( 1 + \frac{\sigma^2 n}{m(\lambda_j - \sigma^2)} \right) - \sigma^2 \left( 1 + \sqrt{\frac{n}{m}} \right)^2}{\sqrt{\frac{2}{\beta n} \lambda_j^2 \left( 1 - \frac{n}{m(\lambda_j - \sigma^2)^2} \right)}} \tag{39}$$

which measures the (theoretical) separation of the $j$-th "signal" eigenvalue from the largest "noise" eigenvalue in
standard deviations of the $j$-th signal eigenvalue's fluctuations. Simulations suggest that reliable detection (with an
empirical probability greater than 90%) of the effective number of signals is possible if $Z_j^{\mathrm{SEP}}$ is larger than a minimum value that ranges between 5 and 15.
This large range of values for the minimum $Z_j^{\mathrm{SEP}}$, which we obtained from the results in Section VII, suggests
that a more precise characterization of the finite system, finite sample performance of the estimator will have to
take into account the more complicated-to-analyze interactions between the "noise" and the "signal" eigenvalues
that are negligible in the large system, large sample limit. Nonetheless, because of the nature of the random matrix
results on which our guidance is based, we expect our heuristics to be more accurate in high-dimensional, relatively
large sample size settings than those proposed in [42] and [6] which rely on Anderson's classical large sample
asymptotics.

(b) When $n$ and $m$ are not quite large enough so that the (largest) "signal" eigenvalue is not
sufficiently separated from the largest "noise" eigenvalue, then reliable detection becomes challenging.

Fig. 4: The finite system dimensionality and sample size induced fluctuations of "signal" (blue) and "noise" (black)
eigenvalues about their limiting positions are shown. The magnitude of the fluctuations impacts the ability to
discriminate the "signal" eigenvalue from the largest "noise" eigenvalue.
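The separation metric can be evaluated directly from the population parameters. The following is a sketch under stated assumptions: the limiting signal eigenvalue position and the noise edge are taken as described in the text, and the fluctuation scale is normalized by $n$, a choice inferred here because it reproduces the $Z^{\mathrm{SEP}}$ values reported in Tables I and II. The function name `z_sep` is hypothetical.

```python
import numpy as np

def z_sep(lam_j, n, m, sigma2=1.0, beta=2):
    """Separation of the j-th signal eigenvalue from the noise edge, in
    standard deviations of its (approximately Gaussian) fluctuations.

    Only meaningful when lam_j > sigma2 * (1 + sqrt(n/m)); otherwise the
    quantity under the square root can be non-positive.
    """
    c = n / m
    signal_loc = lam_j * (1 + sigma2 * c / (lam_j - sigma2))  # limiting position
    noise_edge = sigma2 * (1 + np.sqrt(c)) ** 2               # largest noise eigenvalue
    # Assumed normalization by n, matching the tabulated values.
    std = np.sqrt(2 / (beta * n) * lam_j**2 * (1 - c / (lam_j - sigma2) ** 2))
    return (signal_loc - noise_edge) / std

z_a = z_sep(3.0, n=32, m=128)   # Table I-(a), n = 32, j = k_eff = 2
z_c = z_sep(10.0, n=32, m=8)    # Table I-(c), n = 32, j = k_eff = 1
```

For fixed $n/m$ the metric grows like $\sqrt{n}$, which is why the tabulated values increase by a factor of $\sqrt{2}$ each time $n$ doubles.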



VII. Numerical simulations 

We now illustrate the performance of our estimator using Monte-Carlo simulations. The results obtained provide
evidence for the consistency properties conjectured in Section VI. In all of our simulations, we use a population
covariance matrix $R$ that has arbitrarily fixed, yet unknown, eigenvectors, $k = 2$ "signal" eigenvalues with $\lambda_1 = 10$
and $\lambda_2 = 3$, and $n - 2$ "noise" eigenvalues with $\lambda_3 = \ldots = \lambda_n = \lambda = \sigma^2 = 1$. We assume that the snapshot
vectors $x_i$ modelled as in (1) are complex valued so that we must plug in $\beta = 2$ in (9); the choice of complex
valued signals is motivated by our focus on array signal processing/wireless communications applications.
Over 4000 Monte-Carlo simulations, and various $n$ and $m$, we obtain an estimate of the number of signals from
the eigenvalues of the sample covariance matrix using our new estimator and the modified Wax-Kailath estimator,
described in (9) and (8) respectively. We do not consider the Wax-Kailath AIC estimator in our simulations
because of its proven [2] inconsistency in the fixed system size, large sample limit. We are interested in estimators
that exhibit the consistency conjectured in Section VI in both asymptotic regimes. A thorough comparison of the
performance of our estimator with other estimators (and their ad-hoc modifications) found in the literature is beyond
the scope of this article.

We first investigate the large sample consistency in the classical sense of $n$ fixed and $m \to \infty$. For a choice of
$n$, and different values of $m$, we compute the empirical probability of detecting two signals. For large values of $m$
we expect both the new and the Wax-Kailath MDL estimator to detect both signals with high probability. Figure
5 plots the results obtained in the numerical simulations.
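A Monte-Carlo experiment of this kind could be sketched as follows. The sketch assumes the form of the estimator stated in (9), uses a diagonal population covariance (the estimator depends only on the sample eigenvalues, so the choice of eigenvectors should not matter), and runs far fewer trials than the 4000 used in the paper. The names `estimate_k` and `simulate` and the seed are illustrative assumptions.

```python
import numpy as np

def estimate_k(sample_eigs, m, beta=2):
    """Minimizer of the penalized score (assumed form of the estimator in (9))."""
    l = np.sort(np.asarray(sample_eigs, dtype=float))[::-1]
    n = l.size
    scores = []
    for k in range(min(n, m)):
        tail = l[k:]
        ratio = (n - k) * np.sum(tail**2) / np.sum(tail) ** 2
        t_k = (ratio - (1 + n / m)) * n - (2 / beta - 1) * n / m
        scores.append(beta / 4 * (m / n) ** 2 * t_k**2 + 2 * (k + 1))
    return int(np.argmin(scores))

def simulate(n, m, trials=200, seed=0):
    """Empirical probabilities of k_hat = 0, 1, 2 and k_hat >= 3."""
    rng = np.random.default_rng(seed)
    lam = np.ones(n)
    lam[:2] = [10.0, 3.0]  # two signal eigenvalues, unit noise eigenvalues
    counts = np.zeros(4)
    for _ in range(trials):
        # Circularly symmetric complex Gaussian snapshots with unit variance.
        z = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
        x = np.sqrt(lam)[:, None] * z
        R_hat = x @ x.conj().T / m  # sample covariance matrix
        k_hat = estimate_k(np.linalg.eigvalsh(R_hat), m)
        counts[min(k_hat, 3)] += 1  # last bin collects k_hat >= 3
    return counts / trials

probs = simulate(n=32, m=128)
```

Sweeping `m` for fixed `n` and plotting `probs[2]` against $\log_{10} m$ would produce a curve comparable to panel (a) of Figure 5 for the new estimator.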

Figure 5(a) shows that for $n = 32, 128$, if $m$ is large enough then either estimator is able to detect both signals
with high probability. However, the new estimator requires significantly fewer samples to do so than the Wax-Kailath
MDL estimator.

Figures 5(b) and 5(c) plot the empirical probability of detecting one and zero signals, respectively, as a function
of $m$ for various values of $n$. The results exhibit the chronically reported symptom of estimators underestimating
the number of signals. This is not surprising given the discussion in Section VI. Figure 6 plots the effective number
of identifiable signals $k_{\mathrm{eff}}$, determined using (34) for the various values of $n$ and $m$ considered. We observe that
the values of $n$ and $m$ for which the empirical probability of the new estimator detecting one signal is high also
correspond to regimes where $k_{\mathrm{eff}} = 1$. This suggests that the asymptotic concept of the effective number of signals
remains relevant in a non-asymptotic regime as well. At the same time, however, one should not expect the signal
identifiability/unidentifiability predictions in Section VI to be accurate in the severely sample starved settings where
$m \ll n$. For example, Figure 5(c) reveals that the new estimator detects zero signals with high empirical probability
when there are fewer than 10 samples available even though $k_{\mathrm{eff}} = 1$ in this regime from Figure 6. In the large
system, relatively large sample size asymptotic limit, however, these predictions are accurate, as we discuss next.

When $m = 4n$ samples are available, Figure 7(a) shows that the proposed estimator consistently detects two
signals while the Wax-Kailath MDL estimator does not. However, when $m = n/4$ samples are available, Figure
7(a) suggests that neither estimator is able to detect both the signals present. A closer examination of the empirical
data presents a different picture. The population covariance has two signal eigenvalues $\lambda_1 = 10$ and $\lambda_2 = 3$ with
the noise eigenvalues $\sigma^2 = 1$. Hence, when $m = n/4$, from (33), the effective number of signals $k_{\mathrm{eff}} = 1$. Figure
7(b) shows that for large $n$ and $m = n/4$, the new estimator consistently estimates one signal, as expected. We
remark that the signal eigenvalue $\lambda_2$, which is asymptotically unidentifiable, falls exactly on the threshold in (33).
The consistency of the new estimator with respect to the effective number of signals corroborates the asymptotic
tightness of the fundamental limit of sample eigenvalue based detection. On inspecting Tables I-(b) and I-(d) it
is evident that the Wax-Kailath MDL estimator consistently underestimates the effective number of signals in the
large system, large sample size limit.

Table II provides additional evidence for Conjecture 6.3. We offer Tables II-(c) and II-(d) as evidence for the
observation that large system, large sample consistency aside, the rate of convergence can be arbitrarily slow and
cannot be entirely explained by the metric $Z_j^{\mathrm{SEP}}$ in (39).
(a) Empirical probability that $\hat{k} = 2$ for various $n$ and $m$ (horizontal axis: $\log_{10} m$).

(b) Empirical probability that $\hat{k} = 1$ for various $n$ and $m$ (horizontal axis: $\log_{10} m$).

(c) Empirical probability that $\hat{k} = 0$ for various $n$ and $m$ (horizontal axis: $\log_{10} m$).

Fig. 5: Comparison of the performance of the new estimator in (9) with the MDL estimator in (8) for various $n$
(system size) and $m$ (sample size).
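| $n$ | $m$ | $n/m$ | $k_{\mathrm{eff}}$ | $Z^{\mathrm{SEP}}$ | $P(\hat{k}=0)$ | $P(\hat{k}=1)$ | $P(\hat{k}=2)$ | $P(\hat{k}=3)$ |
|---|---|---|---|---|---|---|---|---|
| 32 | 128 | 0.25 | 2 | 2.1909 | | 0.0035 | 0.9898 | 0.0067 |
| 64 | 256 | 0.25 | 2 | 3.0984 | | | 0.9922 | 0.0077 |
| 128 | 512 | 0.25 | 2 | 4.3818 | | | 0.9930 | 0.0070 |
| 256 | 1024 | 0.25 | 2 | 6.1968 | | | 0.9938 | 0.0063 |
| 512 | 2048 | 0.25 | 2 | 8.7636 | | | 0.9948 | 0.0053 |
| 1024 | 4096 | 0.25 | 2 | 12.3935 | | | 0.9952 | 0.0047 |

(a) The performance of the new estimator when $m = 4n$.

| $n$ | $m$ | $n/m$ | $k_{\mathrm{eff}}$ | $Z^{\mathrm{SEP}}$ | $P(\hat{k}=0)$ | $P(\hat{k}=1)$ | $P(\hat{k}=2)$ | $P(\hat{k}=3)$ |
|---|---|---|---|---|---|---|---|---|
| 32 | 128 | 0.25 | 2 | 2.1909 | | 0.6920 | 0.3080 | |
| 64 | 256 | 0.25 | 2 | 3.0984 | | 0.9663 | 0.0338 | |
| 128 | 512 | 0.25 | 2 | 4.3818 | | 1.0000 | | |
| 256 | 1024 | 0.25 | 2 | 6.1968 | | 1.0000 | | |
| 512 | 2048 | 0.25 | 2 | 8.7636 | | 1.0000 | | |
| 1024 | 4096 | 0.25 | 2 | 12.3935 | | 1.0000 | | |

(b) The performance of the Wax-Kailath MDL based estimator when $m = 4n$.

| $n$ | $m$ | $n/m$ | $k_{\mathrm{eff}}$ | $Z^{\mathrm{SEP}}$ | $P(\hat{k}=0)$ | $P(\hat{k}=1)$ | $P(\hat{k}=2)$ | $P(\hat{k}=3)$ |
|---|---|---|---|---|---|---|---|---|
| 32 | 8 | 4 | 1 | 3.1588 | 0.1867 | 0.7920 | 0.0213 | |
| 64 | 16 | 4 | 1 | 4.4673 | 0.0260 | 0.9537 | 0.0203 | |
| 128 | 32 | 4 | 1 | 6.3177 | 0.0013 | 0.9898 | 0.0090 | |
| 256 | 64 | 4 | 1 | 8.9345 | | 0.9972 | 0.0027 | |
| 512 | 128 | 4 | 1 | 12.6353 | | 0.9998 | 0.0003 | |
| 1024 | 256 | 4 | 1 | 17.8690 | | | | |

(c) The performance of the new estimator when $m = n/4$.

| $n$ | $m$ | $n/m$ | $k_{\mathrm{eff}}$ | $Z^{\mathrm{SEP}}$ | $P(\hat{k}=0)$ | $P(\hat{k}=1)$ | $P(\hat{k}=2)$ | $P(\hat{k}=3)$ |
|---|---|---|---|---|---|---|---|---|
| 32 | 8 | 4 | 1 | 3.1588 | 0.9900 | 0.0100 | | |
| 64 | 16 | 4 | 1 | 4.4673 | 0.9998 | 0.0002 | | |
| 128 | 32 | 4 | 1 | 6.3177 | 1.0000 | | | |
| 256 | 64 | 4 | 1 | 8.9345 | 1.0000 | | | |
| 512 | 128 | 4 | 1 | 12.6353 | 1.0000 | | | |
| 1024 | 256 | 4 | 1 | 17.8690 | 1.0000 | | | |

(d) The performance of the Wax-Kailath MDL based estimator when $m = n/4$.

TABLE I: Comparison of the empirical performance of the new estimator in (9) with the Wax-Kailath MDL
estimator in (8) when the population covariance matrix has two signal eigenvalues $\lambda_1 = 10$ and $\lambda_2 = 3$ and $n - 2$
noise eigenvalues $\lambda_3 = \ldots = \lambda_n = \sigma^2 = 1$. The effective number of identifiable signals $k_{\mathrm{eff}}$ is computed using (34)
while the separation metric $Z_j^{\mathrm{SEP}}$ is computed for $j = k_{\mathrm{eff}}$ using (39). Here $n$ denotes the system size, $m$ denotes
the sample size and the snapshot vectors, modelled in (1), are taken to be complex-valued.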
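| $n$ | $m$ | $n/m$ | $k_{\mathrm{eff}}$ | $Z^{\mathrm{SEP}}$ | $P(\hat{k}=0)$ | $P(\hat{k}=1)$ | $P(\hat{k}=2)$ | $P(\hat{k}=3)$ |
|---|---|---|---|---|---|---|---|---|
| 32 | 4 | 8 | 1 | 2.5218 | 0.8822 | 0.1177 | | |
| 64 | 8 | 8 | 1 | 3.5663 | 0.6937 | 0.3063 | | |
| 128 | 16 | 8 | 1 | 5.0435 | 0.5178 | 0.4823 | | |
| 256 | 32 | 8 | 1 | 7.1326 | 0.3450 | 0.6550 | | |
| 512 | 64 | 8 | 1 | 10.0871 | 0.1993 | 0.8007 | | |
| 1024 | 128 | 8 | 1 | 14.2653 | 0.0745 | 0.9255 | | |

(a) The performance of the new estimator when $m = n/8$.

| $n$ | $m$ | $n/m$ | $k_{\mathrm{eff}}$ | $Z^{\mathrm{SEP}}$ | $P(\hat{k}=0)$ | $P(\hat{k}=1)$ | $P(\hat{k}=2)$ | $P(\hat{k}=3)$ |
|---|---|---|---|---|---|---|---|---|
| 32 | 4 | 8 | 1 | 2.5218 | 0.9995 | 0.0005 | | |
| 64 | 8 | 8 | 1 | 3.5663 | 1.0000 | | | |
| 128 | 16 | 8 | 1 | 5.0435 | 1.0000 | | | |
| 256 | 32 | 8 | 1 | 7.1326 | 1.0000 | | | |
| 512 | 64 | 8 | 1 | 10.0871 | 1.0000 | | | |
| 1024 | 128 | 8 | 1 | 14.2653 | 1.0000 | | | |

(b) The performance of the Wax-Kailath MDL based estimator when $m = n/8$.

| $n$ | $m$ | $n/m$ | $k_{\mathrm{eff}}$ | $Z^{\mathrm{SEP}}$ | $P(\hat{k}=0)$ | $P(\hat{k}=1)$ | $P(\hat{k}=2)$ | $P(\hat{k}=3)$ |
|---|---|---|---|---|---|---|---|---|
| 32 | 32 | 1 | 2 | 1.0887 | | 0.3608 | 0.5972 | 0.0395 |
| 64 | 64 | 1 | 2 | 1.5396 | | 0.2727 | 0.6805 | 0.0445 |
| 128 | 128 | 1 | 2 | 2.1773 | | 0.2130 | 0.7352 | 0.0505 |
| 256 | 256 | 1 | 2 | 3.0792 | | 0.1895 | 0.7592 | 0.0485 |
| 512 | 512 | 1 | 2 | 4.3546 | | 0.1812 | 0.7678 | 0.0495 |
| 1024 | 1024 | 1 | 2 | 6.1584 | | 0.1665 | 0.7780 | 0.0537 |

(c) The performance of the new estimator when $m = n$.

| $n$ | $m$ | $n/m$ | $k_{\mathrm{eff}}$ | $Z^{\mathrm{SEP}}$ | $P(\hat{k}=0)$ | $P(\hat{k}=1)$ | $P(\hat{k}=2)$ | $P(\hat{k}=3)$ |
|---|---|---|---|---|---|---|---|---|
| 32 | 32 | 1 | 2 | 1.0887 | 0.0040 | 0.9955 | 0.0005 | |
| 64 | 64 | 1 | 2 | 1.5396 | 0.0008 | 0.9992 | | |
| 128 | 128 | 1 | 2 | 2.1773 | | 1.0000 | | |
| 256 | 256 | 1 | 2 | 3.0792 | | 1.0000 | | |
| 512 | 512 | 1 | 2 | 4.3546 | 0.0003 | 0.9998 | | |
| 1024 | 1024 | 1 | 2 | 6.1584 | 0.0030 | 0.9970 | | |

(d) The performance of the Wax-Kailath MDL based estimator when $m = n$.

TABLE II: Comparison of the empirical performance of the new estimator in (9) with the Wax-Kailath MDL
estimator in (8) when the population covariance matrix has two signal eigenvalues $\lambda_1 = 10$ and $\lambda_2 = 3$ and $n - 2$
noise eigenvalues $\lambda_3 = \ldots = \lambda_n = \sigma^2 = 1$. The effective number of identifiable signals $k_{\mathrm{eff}}$ is computed using (34)
while the separation metric $Z_j^{\mathrm{SEP}}$ is computed for $j = k_{\mathrm{eff}}$ using (39). Here $n$ denotes the system size, $m$ denotes
the sample size and the snapshot vectors, modelled in (1), are taken to be complex-valued.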
[Plot: curves for $n = 32$, $n = 128$ and $n = 256$ against $\log_{10} m$, with annotations marking where the transition occurs for $n = 32$, $n = 128$ and $n = 256$.]

Fig. 6: The effective number of identifiable signals, computed using (34) for the values of $n$ (system size) and $m$
(sample size) considered in Figure 5 when the population covariance matrix has two signal eigenvalues $\lambda_1 = 10$
and $\lambda_2 = 3$ and $n - 2$ noise eigenvalues $\lambda_3 = \ldots = \lambda_n = \sigma^2 = 1$.






(a) Empirical probability that $\hat{k} = 2$ for various $n$ and fixed $n/m$ (curves: NEW with $m = n/4$, NEW with $m = 4n$, MDL with $m = n/4$, MDL with $m = 4n$; horizontal axis: $\log_{10} n$).

(b) Empirical probability that $\hat{k} = 1$ for various $n$ and fixed $n/m$ (curves: NEW with $m = n/4$, NEW with $m = 4n$; horizontal axis: $\log_{10} n$).

Fig. 7: Comparison of the performance of the new estimator in (9) with the MDL estimator in (8) for various values
of $n$ (system size) with $m$ (sample size) such that $n/m$ is fixed.
VIII. Concluding remarks

We have developed an information theoretic approach for detecting the number of signals in white noise from 
the sample eigenvalues alone. The proposed estimator explicitly takes into account the blurring of the sample
eigenvalues due to the finite sample size. The stated conjecture on the consistency of the algorithm, in both the $n$ fixed,
$m \to \infty$ sense and the $n, m(n) \to \infty$ with $n/m(n) \to c$ sense, remains to be proven. It would be interesting
to investigate the impact of a broader class of penalty functions on the consistency, strong or otherwise, in both 
asymptotic regimes, in the spirit of [23]. 

In future work, we plan to address the problem of estimating the number of high-dimensional signals in noise
with arbitrary covariance [24], using relatively few samples when an independent estimate of the noise sample
covariance matrix, that is itself formed from relatively few samples, is available. This estimator will also be of the
form in (9) and will exploit the analytical characterization of properties of the traces of powers of random Wishart
matrices with a covariance structure that is also random [43]. 

It remains an open question to analyze such signal detection algorithms in the Neyman-Pearson sense of finding 
the most powerful test that does not exceed a threshold probability of false detection. Finer properties, perhaps 
buried in the rate of convergence to the asymptotic results used, might be useful in this context. In the spirit of Wax 
and Kailath's original work, we developed a procedure that did not require us to make any subjective decisions on 
setting threshold levels. Thus, we did not consider largest eigenvalue tests in sample starved settings of the sort 
developed in [31], [44] and the references therein. Nevertheless, if the performance can be significantly improved 
using a sequence of nested hypothesis tests, then this might be a price we might be ready to pay. This is especially 
true for the detection of low-level signals right around the threshold where the asymptotic results suggest that it 
becomes increasingly difficult, if not impossible, to detect signals using the sample eigenvalues alone. 